Tper Hcaeser Pidi Implementation of the Standard I-vector System for the Kaldi Speech Recognition Toolkit

نویسندگان

Srikanth Madikeri

Subhadeep Dey

Petr Motlicek

Marc Ferras

Srikanth R. Madikeri

چکیده

This report describes implementation of the standard i-vector-PLDA framework for the Kaldi speech recognition toolkit. The current existing speaker recognition system implementation is based on the Subspace Gaussian Mixture Model (SGMM) technique although it shares many similarities with the standard implementation. In our implementation, we modified the code so that it mimics the standard algorithms in the ivector based speaker recognition system. The implementation is compared with the existing Kaldi recipe for speaker recognition on the NIST SRE 2008 evaluation set. The entire implementation is made available at https://github.com/idiap/kaldi-ivector under the Apache 2.0 license.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Tper Hcaeser Pidi Implementation of Vtln for Statistical Speech Synthesis

Vocal tract length normalization is an important feature normalization technique that can be used to perform speaker adaptation when very little adaptation data is available. It was shown earlier that VTLN can be applied to statistical speech synthesis and was shown to give additive improvements to CMLLR. This paper presents an EM optimization for estimating more accurate warping factors. The E...

متن کامل

Tper Hcaeser Pidi Application of Out-of-language Detection to Spoken-term Detection

This paper investigates the detection of English spoken terms in a conversational multi-language scenario. The speech is processed using a large vocabulary continuous speech recognition system. The recognition output is represented in the form of word recognition lattices which are then used to search required terms. Due to the potential multi-lingual speech segments at the input, the spoken te...

متن کامل

Tper Hcaeser Pidi Application of Subspace Gaussian Mixture Models in Contrastive Acoustic Scenarios

This paper describes experimental results of applying Subspace Gaussian Mixture Models (SGMMs) in two completely diverse acoustic scenarios: (a) for Large Vocabulary Continuous Speech Recognition (LVCSR) task over (well-resourced) English meeting data and, (b) for acoustic modeling of underresourced Afrikaans telephone data. In both cases, the performance of SGMM models is compared with a conve...

متن کامل

پایه‌گذاری بستری نو و کارآمد در حوزه بازشناسی گفتار فارسی

Although researches in the field of Persian speech recognition claim a thirty-year-old history in Iran which has achieved considerable progresses, due to the lack of well-defined experimental framework, outcomes from many of these researches are not comparable to each other and their accurate assessment won’t be possible. The experimental framework includes ASR toolkit and speech database ...

متن کامل

Implementation of a Radiology Speech Recognition System for Estonian Using Open Source Software

Speech recognition has become increasingly popular in radiology reporting in the last decade. However, developing a speech recognition system for a new language in a highly specific domain requires a lot of resources, expert knowledge and skills. Therefore, commercial vendors do not offer ready-made radiology speech recognition systems for less-resourced languages. This paper describes the impl...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2016

Tper Hcaeser Pidi Implementation of the Standard I-vector System for the Kaldi Speech Recognition Toolkit

نویسندگان

چکیده

منابع مشابه

Tper Hcaeser Pidi Implementation of Vtln for Statistical Speech Synthesis

Tper Hcaeser Pidi Application of Out-of-language Detection to Spoken-term Detection

Tper Hcaeser Pidi Application of Subspace Gaussian Mixture Models in Contrastive Acoustic Scenarios

پایه‌گذاری بستری نو و کارآمد در حوزه بازشناسی گفتار فارسی

Implementation of a Radiology Speech Recognition System for Estonian Using Open Source Software

عنوان ژورنال:

اشتراک گذاری